The Limiting Distribution for the Number of Symbol Comparisons Used by Quicksort Is Nondegenerate
Authors
Abstract
In a continuous-time setting, Fill [2] proved, for a large class of probabilistic sources, that the number of symbol comparisons used by QuickSort, when centered by subtracting the mean and scaled by dividing by time, has a limiting distribution, but proved little about that limiting random variable $Y$, not even that it is nondegenerate. We establish the nondegeneracy of $Y$. The proof is perhaps surprisingly difficult.

1. The number of symbol comparisons used by QuickSort: Brief review of a limiting-distribution result

In this section we briefly review the main theorem of [2]. An infinite sequence of independent and identically distributed keys is generated; each key is a random word $(w_1, w_2, \ldots) = w_1 w_2 \cdots$, that is, an infinite sequence, or "string", of symbols $w_i$ drawn from a totally ordered finite alphabet $\Sigma$. The common distribution $\mu$ of the keys (called a probabilistic source) is allowed to be any distribution over words, i.e., the distribution of any stochastic process with time parameter set $\{1, 2, \ldots\}$ and state space $\Sigma$. We know thanks to Kolmogorov's consistency criterion (e.g., Theorem 3.3.6 in [1]) that the possible distributions $\mu$ are in one-to-one correspondence with consistent specifications of finite-dimensional marginals, i.e., of the fundamental probabilities

$$p_w := \mu(\{w_1 w_2 \cdots w_k\} \times \Sigma^{\infty}) \quad \text{with } w = w_1 w_2 \cdots w_k \in \Sigma^{*}. \tag{1.1}$$

This $p_w$ is the probability that a word drawn from $\mu$ has $w$ as its length-$k$ prefix.

For each $n$, Hoare's [6] QuickSort algorithm can be used to sort the first $n$ keys to be generated. We may and do assume that the first key in the sequence is chosen as the pivot, and that the same is true recursively (in the sense, for example, that the pivot used to sort the keys smaller than the original pivot is the first key to be generated that is smaller than the original pivot). A comparison of two keys is done by scanning the two words from left to right, comparing the symbols of matching index one by one until a difference is found.
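The comparison scheme just described can be made concrete with a short simulation. The following Python sketch is illustrative only and is not taken from the paper: it assumes a memoryless source over a two-symbol alphabet (the text allows any source), and the names `LazyWord`, `compare`, `quicksort`, and `symbol_comparisons` are hypothetical. It uses the first key generated as the pivot at every level and counts one symbol comparison for each pair of symbols inspected.

```python
import random
from itertools import count


class LazyWord:
    """Infinite word whose symbols are drawn on demand.

    Assumption for illustration: a memoryless source, i.e. symbols are
    i.i.d. with the given weights. The text allows far more general sources.
    """

    def __init__(self, rng, alphabet, weights):
        self._rng, self._alphabet, self._weights = rng, alphabet, weights
        self._symbols = []

    def __getitem__(self, i):
        # Extend the stored prefix until position i exists.
        while len(self._symbols) <= i:
            self._symbols.append(
                self._rng.choices(self._alphabet, self._weights)[0]
            )
        return self._symbols[i]


def compare(u, v, counter):
    """Scan u and v left to right until the first differing symbol.

    Returns -1 if u < v and 1 otherwise, adding the number of symbol
    comparisons made to counter[0]. Assumes u and v differ somewhere.
    """
    for i in count():
        counter[0] += 1
        if u[i] != v[i]:
            return -1 if u[i] < v[i] else 1


def quicksort(keys, counter):
    """QuickSort with the first key as pivot, preserving generation order
    inside each sublist so the same pivot rule applies recursively."""
    if len(keys) <= 1:
        return list(keys)
    pivot = keys[0]
    smaller, larger = [], []
    for key in keys[1:]:
        (smaller if compare(key, pivot, counter) < 0 else larger).append(key)
    return quicksort(smaller, counter) + [pivot] + quicksort(larger, counter)


def symbol_comparisons(keys):
    """Return (sorted keys, total number of symbol comparisons)."""
    counter = [0]
    ordered = quicksort(keys, counter)
    return ordered, counter[0]


if __name__ == "__main__":
    rng = random.Random(2012)  # arbitrary seed, for reproducibility only
    words = [LazyWord(rng, "ab", [0.5, 0.5]) for _ in range(200)]
    _, total = symbol_comparisons(words)
    print(total)
```

The same functions accept ordinary strings as long as every pair differs within their common length. On the three words `ba`, `ab`, `aa` the sketch makes 4 symbol comparisons: one each for `ab` and `aa` against the pivot `ba`, then two for `aa` against the pivot `ab`.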
We let $S_n$ denote the total number of symbol comparisons needed when $n$ keys are sorted by QuickSort.

Theorem 1.1 (Fill [2], Theorem 3.1). Consider the continuous-time setting in which keys are generated from a probabilistic source at the arrival times of an independent Poisson process $N$ with unit rate. Let $S(t) = S_{N(t)}$ denote the number of symbol comparisons required by QuickSort to sort the keys generated through epoch $t$, and let

$$Y(t) := \frac{S(t) - \mathbf{E}\,S(t)}{t}, \quad 0 < t < \infty. \tag{1.2}$$

Assume that (1.3) $\sum^{\infty}$ ...

Patrick Bindjeme and James Allen Fill. Date: January 27, 2012. Research supported by the Acheson J. Duncan Fund for the Advancement of Research in Statistics.
Similar resources
The limiting distribution for the number of symbol comparisons used by QuickSort is nondegenerate (extended abstract)
In a continuous-time setting, Fill (2012) proved, for a large class of probabilistic sources, that the number of symbol comparisons used by QuickSort, when centered by subtracting the mean and scaled by dividing by time, has a limiting distribution, but proved little about that limiting random variable Y, not even that it is nondegenerate. We establish the nondegeneracy of Y. The proof is perh...
Distributional convergence for the number of symbol comparisons used by QuickSort
Most previous studies of the sorting algorithm QuickSort have used the number of key comparisons as a measure of the cost of executing the algorithm. Here we suppose that the n independent and identically distributed (iid) keys are each represented as a sequence of symbols from a probabilistic source and that QuickSort operates on individual symbols, and we measure the execution cost as the num...
Approximating the limiting Quicksort distribution
The limiting distribution of the normalized number of comparisons used by Quicksort to sort an array of n numbers is known to be the unique fixed point with zero mean of a certain distributional transformation S. We study the convergence to the limiting distribution of the sequence of distributions obtained by iterating the transformation S, beginning with a (nearly) arbitrary starting distribu...
Smoothness and decay properties of the limiting Quicksort density function
Using Fourier analysis, we prove that the limiting distribution of the standardized random number of comparisons used by Quicksort to sort an array of n numbers has an everywhere positive and infinitely differentiable density $f$, and that each derivative $f^{(k)}$ enjoys superpolynomial decay at ±∞. In particular, each $f^{(k)}$ is bounded. Our method is sufficiently computational to prove, for example...
Quicksort Algorithm Again Revisited
We consider the standard Quicksort algorithm that sorts n distinct keys with all possible n! orderings of keys being equally likely. Equivalently, we analyze the total path length n in a randomly built binary search tree. Obtaining the limiting distribution of n is still an outstanding open problem. In this paper, we establish an integral equation for the probability density of the number of co...
Publication date: 2012